less than the number of groups.

The denominator degrees of freedom: This number is designated as df2 or dfd, which is the total number of observations minus the number of groups.

The p value can be calculated from the values of F, the numerator degrees of freedom, and the denominator degrees of freedom, and the software performs this

calculation for you. If the p value from the ANOVA is statistically significant — less than 0.05 or your

chosen α level — then you can conclude that the group means are not all equal and you can reject the

null hypothesis. Technically, that means at least one group mean was far enough away from another mean to push the F statistic well above 1, which is what made the p value statistically significant.
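To make those moving parts concrete, here is a minimal sketch in Python (SciPy is an assumption; this section does not require any particular package) that computes the two degrees of freedom and then turns an F statistic into a p value. The group count, sample size, and F value are invented for illustration.

```python
from scipy.stats import f  # the F distribution

k = 3         # hypothetical number of groups
n_total = 60  # hypothetical total number of observations

df_num = k - 1        # numerator df: one less than the number of groups
df_den = n_total - k  # denominator df: observations minus groups

F_stat = 4.2  # hypothetical F statistic reported by the ANOVA

# The p value is the area in the upper tail of the F distribution beyond F_stat
p_value = f.sf(F_stat, df_num, df_den)
print(f"df = ({df_num}, {df_den}), p = {p_value:.4f}")
```

In practice the statistical software prints this p value for you; the sketch just shows where the number comes from.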

Picking through post-hoc tests

Suppose that the ANOVA is not statistically significant (meaning the p value was greater than 0.05). In that case,

there is no point in doing any t tests, because all the means are close to each other. But if the ANOVA

is statistically significant, we are left with the question: Which group means are higher or lower than

others? Answering that question requires us to do post-hoc tests, which are t tests done after an

ANOVA (post hoc is Latin for “after this”).
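As a rough picture of what those post-hoc t tests look like, the sketch below runs every pairwise t test among three made-up groups; Python and SciPy are just one convenient way to do it, not something this section requires. The labels echo the marital groups discussed below (M, NM, and OTH), but the fasting-glucose values are invented.

```python
from itertools import combinations
from scipy.stats import ttest_ind

# Invented fasting-glucose values for three marital-status groups
glucose = {
    "M":   [92, 101, 98, 110, 95, 104],
    "NM":  [88, 94, 90, 97, 93, 99],
    "OTH": [105, 112, 99, 108, 111, 102],
}

# One t test for each pair of groups -- these are the post-hoc tests
for g1, g2 in combinations(glucose, 2):
    t_stat, p_val = ttest_ind(glucose[g1], glucose[g2])
    print(f"{g1} vs {g2}: t = {t_stat:.2f}, p = {p_val:.4f}")
```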

Although using post-hoc tests can be helpful, controlling Type I error is not that easy in reality. There

can be issues with the data that may make you not trust the results of your post-hoc tests, such as having too many levels in the grouping variable you are testing in your ANOVA, or having one or more of the levels with

very few participants (so the results are unstable). Still, if you have a statistically significant ANOVA,

you should do post-hoc t tests, just so you know the answer to the question stated earlier.

It’s okay to do these post-hoc tests; you just have to take a penalty. In statistics, a penalty is where you deliberately make something harder for yourself. In this case, we take the penalty by making it deliberately harder to conclude that a p value from a t test is statistically significant. We do that by adjusting the α to be lower than 0.05. How much we adjust it depends on the post-hoc test we choose.

The Bonferroni adjustment uses this calculation to determine the new, lower alpha: α/N, where N is the number of post-hoc comparisons you plan to make. As you can tell, the Bonferroni adjustment is easy to do manually! In the case of our three marital groups (M, NM, and OTH), there are three pairwise comparisons, so our adjusted Bonferroni α would be 0.05/3, which is about 0.017. This means that for a post-hoc t test of average fasting glucose between two of the three marital groups, the p value would not be interpreted as significant unless it was less than 0.017 (which is a tougher criterion than only having to be less than 0.05). Even though the Bonferroni adjustment is easy to do by hand, it is not used very often in practice, because most analysts use statistical packages when doing these calculations.
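Here is a quick sketch of that arithmetic and of how the tougher cutoff gets applied. Only the adjustment formula comes from the text; the three p values are invented for illustration.

```python
k_groups = 3                                    # M, NM, and OTH
n_comparisons = k_groups * (k_groups - 1) // 2  # 3 pairwise t tests
alpha = 0.05

bonferroni_alpha = alpha / n_comparisons        # 0.05 / 3, about 0.0167
print(f"Bonferroni-adjusted alpha: {bonferroni_alpha:.4f}")

# Invented p values from the three post-hoc t tests
p_values = {"M vs NM": 0.030, "M vs OTH": 0.012, "NM vs OTH": 0.450}
for pair, p in p_values.items():
    verdict = "significant" if p < bonferroni_alpha else "not significant"
    print(f"{pair}: p = {p:.3f} -> {verdict}")
```

Notice that a p value of 0.030 would have passed the usual 0.05 cutoff but does not pass the adjusted one; that is the penalty at work.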

Tukey’s HSD (“honestly” significant difference) test adjusts α in a different way than the Bonferroni adjustment does. It is intended to be used when each level of the grouping variable has the same number of observations (also called balanced groups).
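If you would rather let software do the adjusting, here is a minimal sketch using the pairwise_tukeyhsd function from statsmodels (an assumption; the text does not name a particular package). The glucose numbers are the same invented values used earlier.

```python
import numpy as np
from statsmodels.stats.multicomp import pairwise_tukeyhsd

# Invented fasting-glucose values, stacked with a group label for each one
values = np.array([92, 101, 98, 110, 95, 104,     # M
                   88, 94, 90, 97, 93, 99,        # NM
                   105, 112, 99, 108, 111, 102])  # OTH
labels = np.array(["M"] * 6 + ["NM"] * 6 + ["OTH"] * 6)

# Compares every pair of groups while holding the family-wise alpha at 0.05
result = pairwise_tukeyhsd(endog=values, groups=labels, alpha=0.05)
print(result)
```

The same call also accepts groups of different sizes, which is the situation the Tukey-Kramer generalization described next is meant for.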

The Tukey-Kramer test is a generalization of the original Tukey’s HSD test designed to handle

different-sized (also called unbalanced) groups. Since Tukey-Kramer also handles balanced